Smart Paper Notes

Portuguese Man-of-War Image Classification with Convolutional Neural Networks

Alessandra Carneiro , Lorena Nascimento , Mauricio Noernberg , Carmem Hara , Aurora Pozo

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-07-04

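The Portuguese man-of-war (PMW) is a gelatinous organism with long tentacles that can cause severe burns, with negative impacts on human activities such as tourism and fishing. Information about the spatio-temporal dynamics of this species is lacking, so alternative methods of data collection can contribute to its monitoring. Given the widespread use of social networks and the eye-catching appearance of PMW, Instagram posts may be a promising data source for monitoring. The first task in this approach is to identify posts that refer to PMW. This paper reports on the use of convolutional neural networks for PMW image classification in order to automate the recognition of Instagram posts. We created a suitable dataset and trained three different neural networks, VGG-16, ResNet50, and InceptionV3, with a pre-training step on the ImageNet dataset. We analyzed their results using the accuracy, precision, recall, and F1 score metrics. The pre-trained ResNet50 network gave the best results, obtaining 94% accuracy and 95% precision, recall, and F1 score. These results show that convolutional neural networks are very effective at recognizing PMW images from Instagram.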
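
The four metrics named in the abstract can all be derived from confusion-matrix counts. The following is a minimal sketch for a binary PMW / not-PMW task; the function name and the toy labels are illustrative assumptions, not the authors' code or data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example (made-up labels): 1 = post shows a PMW, 0 = it does not.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
# tp=3, fp=1, fn=1, tn=3, so all four metrics equal 0.75 here
```

Accuracy counts every correct decision, while precision and recall look only at the positive class, which is why the abstract can report slightly different values for them.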

Translated by Google Translate

Asymmetric Co-teaching with Multi-view Consensus for Noisy Label Learning

Fengbei Liu , Yuanhong Chen , Chong Wang , Yu Tian , Gustavo Carneiro

Categories: Computer Vision

2023-01-01

Learning with noisy-labels has become an important research topic in computer vision where state-of-the-art (SOTA) methods explore: 1) prediction disagreement with co-teaching strategy that updates two models when they disagree on the prediction of training samples; and 2) sample selection to divide the training set into clean and noisy sets based on small training loss. However, the quick convergence of co-teaching models to select the same clean subsets combined with relatively fast overfitting of noisy labels may induce the wrong selection of noisy label samples as clean, leading to an inevitable confirmation bias that damages accuracy. In this paper, we introduce our noisy-label learning approach, called Asymmetric Co-teaching (AsyCo), which introduces novel prediction disagreement that produces more consistent divergent results of the co-teaching models, and a new sample selection approach that does not require small-loss assumption to enable a better robustness to confirmation bias than previous methods. More specifically, the new prediction disagreement is achieved with the use of different training strategies, where one model is trained with multi-class learning and the other with multi-label learning. Also, the new sample selection is based on multi-view consensus, which uses the label views from training labels and model predictions to divide the training set into clean and noisy for training the multi-class model and to re-label the training samples with multiple top-ranked labels for training the multi-label model. Extensive experiments on synthetic and real-world noisy-label datasets show that AsyCo improves over current SOTA methods.
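
To make the two selection ideas in the abstract concrete, here is a minimal sketch. `small_loss_selection` is the conventional small-loss criterion the abstract argues against relying on; `agreement_split` is a deliberately simplified stand-in for consensus between label views (training label plus the predictions of two models). Both function names, the agreement rule, and the toy values are assumptions for illustration and are not AsyCo's actual procedure.

```python
def small_loss_selection(losses, keep_ratio):
    """Treat the keep_ratio fraction of samples with the smallest loss as clean."""
    n_keep = int(len(losses) * keep_ratio)
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(order[:n_keep])


def agreement_split(labels, preds_a, preds_b):
    """Split sample indices by agreement between three label views.

    A sample is called clean only when the training label and the top
    prediction of both models coincide; everything else is called noisy.
    """
    clean, noisy = [], []
    for i, (y, a, b) in enumerate(zip(labels, preds_a, preds_b)):
        if y == a == b:
            clean.append(i)
        else:
            noisy.append(i)
    return clean, noisy


# Toy per-sample losses and predictions (made up for illustration).
losses = [0.1, 2.3, 0.4, 1.7, 0.2, 3.0]
clean_by_loss = small_loss_selection(losses, keep_ratio=0.5)

labels = [0, 1, 2, 1]
clean, noisy = agreement_split(labels, preds_a=[0, 1, 1, 1], preds_b=[0, 2, 1, 1])
```

The second function needs no loss threshold at all, which is the property the abstract highlights for its consensus-based selection.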

The Quantum Path Kernel: a Generalized Quantum Neural Tangent Kernel for Deep Quantum Machine Learning

Massimiliano Incudini , Michele Grossi , Antonio Mandarino , Sofia Vallecorsa , Alessandra Di Pierro , David Windridge

Categories: Machine Learning

2022-12-22

Building a quantum analog of classical deep neural networks represents a fundamental challenge in quantum computing. A key issue is how to address the inherent non-linearity of classical deep learning, a problem in the quantum domain due to the fact that the composition of an arbitrary number of quantum gates, consisting of a series of sequential unitary transformations, is intrinsically linear. This problem has been variously approached in the literature, principally via the introduction of measurements between layers of unitary transformations. In this paper, we introduce the Quantum Path Kernel, a formulation of quantum machine learning capable of replicating those aspects of deep machine learning typically associated with superior generalization performance in the classical domain, specifically, hierarchical feature learning. Our approach generalizes the notion of Quantum Neural Tangent Kernel, which has been used to study the dynamics of classical and quantum machine learning models. The Quantum Path Kernel exploits the parameter trajectory, i.e. the curve delineated by model parameters as they evolve during training, enabling the representation of differential layer-wise convergence behaviors, or the formation of hierarchical parametric dependencies, in terms of their manifestation in the gradient space of the predictor function. We evaluate our approach with respect to variants of the classification of Gaussian XOR mixtures - an artificial but emblematic problem that intrinsically requires multilevel learning in order to achieve optimal class separation.
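
The relationship between a tangent kernel and a kernel built from the whole parameter trajectory can be illustrated with a purely classical toy model. The sketch below assumes a two-parameter predictor f(x) = a * tanh(b * x) trained by gradient descent; the tangent kernel uses the gradients at one parameter setting, while the path kernel averages it over the recorded snapshots. This is a classical analogue under stated assumptions, not the quantum construction from the paper, and all names are hypothetical.

```python
import math


def grad_f(x, theta):
    """Gradient of f(x) = a * tanh(b * x) with respect to (a, b)."""
    a, b = theta
    t = math.tanh(b * x)
    return (t, a * x * (1.0 - t * t))


def tangent_kernel(x1, x2, theta):
    """Inner product of the predictor gradients at a single parameter point."""
    g1, g2 = grad_f(x1, theta), grad_f(x2, theta)
    return sum(u * v for u, v in zip(g1, g2))


def path_kernel(x1, x2, trajectory):
    """Average of the tangent kernel over all recorded parameter snapshots."""
    return sum(tangent_kernel(x1, x2, th) for th in trajectory) / len(trajectory)


def train(data, theta, lr=0.1, steps=50):
    """Gradient descent on squared error; returns every visited parameter point."""
    trajectory = [theta]
    for _ in range(steps):
        a, b = theta
        ga = gb = 0.0
        for x, y in data:
            err = a * math.tanh(b * x) - y
            da, db = grad_f(x, theta)
            ga += err * da
            gb += err * db
        theta = (a - lr * ga / len(data), b - lr * gb / len(data))
        trajectory.append(theta)
    return trajectory


data = [(-1.0, -0.5), (0.5, 0.3), (1.0, 0.5)]
trajectory = train(data, theta=(0.5, 0.5), steps=20)
k_start = tangent_kernel(0.3, 0.7, trajectory[0])   # single-point kernel
k_path = path_kernel(0.3, 0.7, trajectory)          # trajectory-aware kernel
```

With a one-element trajectory the two kernels coincide, which is the sense in which a path-based kernel generalizes a tangent kernel.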

An adaptive human-in-the-loop approach to emission detection of Additive Manufacturing processes and active learning with computer vision

Xiao Liu , Alan F. Smeaton , Alessandra Mileo

Categories: Machine Learning | Artificial Intelligence | Computer Vision

2022-12-12

Recent developments in in-situ monitoring and process control in Additive Manufacturing (AM), also known as 3D printing, allow the collection of large amounts of emission data during the build process of the parts being manufactured. This data can be used as input into 3D and 2D representations of the 3D-printed parts. However, the analysis, use, and characterization of this data still remain a manual process. The aim of this paper is to propose an adaptive human-in-the-loop approach using Machine Learning techniques that automatically inspect and annotate the emissions data generated during the AM process. More specifically, this paper looks at two scenarios: first, using convolutional neural networks (CNNs) to automatically inspect and classify emission data collected by in-situ monitoring and, second, applying Active Learning techniques to the developed classification model to construct a human-in-the-loop mechanism that accelerates the labeling of the emission data. The CNN-based approach relies on transfer learning and fine-tuning, which makes the approach applicable to other industrial image patterns. The adaptive nature of the approach is enabled by an uncertainty sampling strategy that automatically selects the samples to be presented to human experts for annotation.
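
Uncertainty sampling, as mentioned in the abstract, can be sketched in a few lines. The version below ranks unlabeled samples by the entropy of the classifier's predicted class probabilities and returns the most uncertain ones for human annotation. Entropy is one common uncertainty measure and is an assumption here, since the abstract does not say which measure the authors use; the function names and probabilities are illustrative.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_annotation(pool_probs, budget):
    """Return indices of the `budget` most uncertain samples in the pool."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:budget]


# Made-up softmax outputs of a two-class emission classifier.
pool_probs = [
    [0.98, 0.02],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.50, 0.50],
]
to_label = select_for_annotation(pool_probs, budget=2)
# The two samples closest to a 50/50 prediction (indices 3 and 1) are chosen.
```

In an active learning loop the selected samples would be labeled by an expert, added to the training set, and the classifier fine-tuned again before the next selection round.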

Whose Emotion Matters? Speaker Detection without Prior Knowledge

Hugo Carneiro , Cornelius Weber , Stefan Wermter

Categories: Computer Vision | Machine Learning | Neural and Evolutionary Computing

2022-11-23

The task of emotion recognition in conversations (ERC) benefits from the availability of multiple modalities, as offered, for example, in the video-based MELD dataset. However, only a few research approaches use both acoustic and visual information from the MELD videos. There are two reasons for this: First, label-to-video alignments in MELD are noisy, making those videos an unreliable source of emotional speech data. Second, conversations can involve several people in the same scene, which requires the detection of the person speaking the utterance. In this paper we demonstrate that by using recent automatic speech recognition and active speaker detection models, we are able to realign the videos of MELD, and capture the facial expressions from uttering speakers in 96.92% of the utterances provided in MELD. Experiments with a self-supervised voice recognition model indicate that the realigned MELD videos more closely match the corresponding utterances offered in the dataset. Finally, we devise a model for emotion recognition in conversations trained on the face and audio information of the MELD realigned videos, which outperforms state-of-the-art models for ERC based on vision alone. This indicates that active speaker detection is indeed effective for extracting facial expressions from the uttering speakers, and that faces provide more informative visual cues than the visual features state-of-the-art models have been using so far.

Knowledge Distillation to Ensemble Global and Interpretable Prototype-Based Mammogram Classification Models

Chong Wang , Yuanhong Chen , Yuyuan Liu , Yu Tian , Fengbei Liu , Davis J. McCarthy , Michael Elliott , Helen Frazer , Gustavo Carneiro

Categories: Computer Vision

2022-09-26

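State-of-the-art (SOTA) deep learning mammogram classifiers, trained with weakly labeled images, often rely on global models that produce predictions with limited interpretability, which is a key barrier to their successful translation into clinical practice. Prototype-based models, on the other hand, improve interpretability by associating predictions with training image prototypes, but they are less accurate than global models and their prototypes tend to have poor diversity. We address these two issues by proposing BRAIxProtoPNet++, which adds interpretability to a global model by ensembling it with a prototype-based model. BRAIxProtoPNet++ distills the knowledge of the global model when training the prototype-based model, with the goal of increasing the classification accuracy of the ensemble. In addition, we propose an approach to increase prototype diversity by guaranteeing that all prototypes are associated with different training images. Experiments on weakly labeled private and public datasets show that BRAIxProtoPNet++ has higher classification accuracy than SOTA global and prototype-based models. Using lesion localization to assess model interpretability, we show that BRAIxProtoPNet++ is more effective than other prototype-based models and than post-hoc explanations of global models. Finally, we show that the diversity of the prototypes learned by BRAIxProtoPNet++ is superior to that of SOTA prototype-based approaches.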
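
The two mechanisms named in the abstract, distillation from a global model and ensembling of the two models, can be sketched generically. The code below uses a common temperature-softened KL-divergence distillation term and a weighted average of class probabilities. Both are assumptions chosen for illustration: the abstract does not specify the distillation loss or the ensembling rule, so this should not be read as the paper's formulation.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def ensemble(global_probs, proto_probs, weight=0.5):
    """Weighted average of the two models' class probabilities."""
    return [weight * g + (1.0 - weight) * p
            for g, p in zip(global_probs, proto_probs)]


# Made-up logits: the global model plays the teacher, the prototype-based
# model plays the student.
teacher_logits = [2.0, -1.0]
student_logits = [0.5, 0.2]
kd = distillation_loss(student_logits, teacher_logits)
combined = ensemble(softmax(teacher_logits), softmax(student_logits))
```

The distillation term is zero when the student reproduces the teacher's softened distribution and grows as the two diverge.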

Structure Learning of Quantum Embeddings

Massimiliano Incudini , Francesco Martini , Alessandra Di Pierro

Categories: Machine Learning

2022-09-22

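The representation of data is of paramount importance for machine learning methods. Kernel methods are used to enrich the feature representation, allowing better generalization. Quantum kernels efficiently implement complex transformations that encode classical data in the Hilbert space of a quantum system, in some cases even yielding an exponential speedup. However, prior knowledge of the data is needed to choose an appropriate parametric quantum circuit (PQC) to use as the quantum embedding. We propose an algorithm that automatically selects the best quantum embedding through a combinatorial optimization procedure that modifies the structure of the circuit, changing the generators of the gates, their angles (which depend on the data points), and the qubits on which the various gates act. Since combinatorial optimization is computationally expensive, we introduce a criterion based on the exponential concentration of the kernel matrix coefficients around their mean to immediately discard an arbitrarily large portion of solutions that are expected to perform poorly. In contrast to gradient-based optimization (e.g. trainable quantum kernels), our approach is, by construction, not affected by barren plateaus. We have used both artificial and real-world datasets to demonstrate the improved performance of our approach over randomly generated PQCs. We have also compared the effect of different optimization algorithms, including greedy local search, simulated annealing, and genetic algorithms, showing that the choice of algorithm strongly affects the result.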
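
The idea of discarding candidate embeddings whose kernel values concentrate around their mean can be sketched as a cheap pre-filter. The code below treats a kernel matrix as "concentrated" when the variance of its off-diagonal entries falls below a threshold, since such a kernel can barely distinguish data points. The variance statistic, the threshold value, and the function names are assumptions; the abstract does not give the exact criterion.

```python
from statistics import pvariance


def off_diagonal(kernel):
    """All entries K[i][j] with i != j of a square kernel matrix."""
    n = len(kernel)
    return [kernel[i][j] for i in range(n) for j in range(n) if i != j]


def is_concentrated(kernel, threshold=1e-3):
    """True when the off-diagonal kernel values barely vary around their mean."""
    return pvariance(off_diagonal(kernel)) < threshold


def filter_candidates(kernels, threshold=1e-3):
    """Indices of candidate embeddings worth passing on to the optimizer."""
    return [i for i, k in enumerate(kernels)
            if not is_concentrated(k, threshold)]


# Two made-up 3x3 kernel matrices from hypothetical candidate circuits.
flat = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
varied = [[1.0, 0.9, 0.1],
          [0.9, 1.0, 0.4],
          [0.1, 0.4, 1.0]]
kept = filter_candidates([flat, varied])   # only the second candidate survives
```

A search procedure such as greedy local search or simulated annealing would then evaluate only the surviving candidates, which is where the saving in computation comes from.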

Multi-view Local Co-occurrence and Global Consistency Learning Improve Mammogram Classification Generalisation

Yuanhong Chen , Hu Wang , Chong Wang , Yu Tian , Fengbei Liu , Michael Elliott , Davis J. McCarthy , Helen Frazer , Gustavo Carneiro

Categories: Computer Vision

2022-09-21

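When analyzing screening mammograms, radiologists naturally process information across the two ipsilateral views of each breast, namely the cranio-caudal (CC) and mediolateral-oblique (MLO) views. These multiple related images provide complementary diagnostic information and can improve the radiologist's classification accuracy. Unfortunately, most existing deep learning systems, trained with globally labeled images, lack the ability to jointly analyze and integrate global and local information from these multiple views. Ignoring the potentially valuable information present in the multiple images of a screening episode limits the potential accuracy of these systems. Here, we propose a new multi-view global-local analysis method that mimics the radiologist's reading procedure, based on global consistency learning and local co-occurrence learning of the ipsilateral views in mammograms. Extensive experiments show that our model outperforms competing methods in classification accuracy and generalization on a large-scale private dataset and two publicly available datasets, where models are trained and tested with global labels only.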

On the Optimal Combination of Cross-Entropy and Soft Dice Losses for Lesion Segmentation with Out-of-Distribution Robustness

Adrian Galdran , Gustavo Carneiro , Miguel Ángel González Ballester

Categories: Computer Vision

2022-09-13

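We study the impact of different loss functions on lesion segmentation from medical images. Although the Cross-Entropy (CE) loss is the most popular choice when dealing with natural images, for biomedical image segmentation the soft Dice loss is often preferred because of its ability to handle imbalanced scenarios. The combination of the two functions has also been applied successfully to this kind of task. A much less studied problem is the generalization ability of all these losses in the presence of Out-of-Distribution (OOD) data, that is, samples appearing at test time that are drawn from a different distribution than the training images. In our case, we train our models on images that always contain lesions, but at test time we also have lesion-free samples. Through comprehensive experiments on lesion segmentation in endoscopic images and ulcer segmentation in diabetic foot images, we analyze the impact of minimizing different loss functions on in-distribution performance and on the ability to generalize to OOD data. Our findings are surprising: CE-Dice loss combinations that excel at segmenting in-distribution images perform poorly on OOD data, which leads us to recommend adopting the CE loss for this kind of problem because of its robustness and its ability to generalize to OOD samples. Code associated with our experiments can be found at https://github.com/agaldran/lesion_losses_ood.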
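
For readers unfamiliar with the losses being compared, the sketch below gives minimal per-image versions of binary cross-entropy, soft Dice, and a weighted combination, operating on flat lists of predicted foreground probabilities and 0/1 targets. The smoothing constant, the mixing weight `alpha`, and the function names are assumptions; the authors' actual implementations are in the repository linked in the abstract.

```python
import math


def cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def soft_dice_loss(probs, targets, eps=1e-7):
    """1 minus the soft Dice overlap between prediction and target."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def combined_loss(probs, targets, alpha=0.5):
    """Weighted sum of the two losses; alpha=1 gives pure cross-entropy."""
    return (alpha * cross_entropy(probs, targets)
            + (1.0 - alpha) * soft_dice_loss(probs, targets))


# A lesion-free image (all-zero target) with a nearly correct prediction.
empty_target = [0, 0, 0, 0]
near_empty_pred = [0.01, 0.01, 0.01, 0.01]
ce = cross_entropy(near_empty_pred, empty_target)      # small
dice = soft_dice_loss(near_empty_pred, empty_target)   # close to 1
```

One property of this particular Dice formulation is visible in the example: with an empty target the overlap term vanishes, so even a nearly correct prediction is scored as almost maximally wrong, whereas cross-entropy stays small. This is only an observation about the sketch, not a claim about the paper's analysis.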

"iCub, We Forgive You!" Investigating Trust in a Game Scenario with Kids

Francesca Cocchella , Giulia Pusceddu , Giulia Belgiovine , Linda Lastrico , Francesco Rea , Alessandra Sciutti

Categories: Robotics

2022-09-04

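This study presents novel strategies to investigate the mutual influence of trust and group dynamics in child-robot interaction. We implemented a game-like experimental activity with the humanoid robot iCub and designed a questionnaire to assess how the children perceived the interaction. We also aimed to verify whether the sensors, setups, and tasks are suitable for studying such aspects. The questionnaire results show that the young participants perceive iCub as a friend and, typically, in a positive way. Other preliminary results suggest that children generally trusted iCub during the activity and that, after its mistakes, they tried to reassure it with sentences such as "Don't worry iCub, we forgive you". Furthermore, trust towards the robot in a group cognitive activity appears to vary with gender: after two consecutive mistakes by the robot, girls tended to trust iCub more than boys did. Finally, no significant difference was found between age groups in either the points computed from the game or the self-reported scales. The tools we propose are suitable for studying trust in human-robot interaction (HRI) across different age groups and seem appropriate for understanding the mechanisms of trust in group interactions.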
